New York has always been a fascinating city with visitors year-round, which makes tourism a thriving business. Manhattan is still the center of attention for visitors, and hotels there are quite expensive.
Brooklyn has become trendy for its arts and music scene, shaped by the influence of different cultures, especially Black communities, although some of those communities are suffering from gentrification. With new investments in infrastructure, its proximity to Manhattan, and access to the subway (a cheap means of transport), it is an ideal place to set up a trendy Bed and Breakfast: for example, turning a four-story brownstone home into a good business for locals and families instead of rentals.
The idea is to give a prospective investor the best possible information to decide, based on these requirements:
Within a 1.6 km radius of the Barclays Center
Not too crowded with B&Bs or similar businesses
With a variety of venues such as cafes, restaurants and pubs
Optional requirements that add value to the exercise:
Make a competitive analysis: if possible, get a description of the competitors' buildings by size (square meters or square feet).
Check the area under study for properties of similar size and their prices.
Select a few properties and check their immediate neighborhoods for venues and amenities.
Make a recommendation.
We will use the skills and tools learned in the IBM-Coursera Specialization to meet these requirements.
With the problem in hand, we need the following information:
Identify the B&Bs and similar businesses within a 1.6 km radius of the Barclays Center.
✔ This will fulfill the 1st and 2nd requirements
✔ This will make neighborhoods "compete" for the investment, fulfilling the 3rd requirement
Additional information:
✔ Gather address, square meters, features, price and realtor contact.
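The 1.6 km radius requirement can be checked directly from venue coordinates. The following is a minimal sketch, not the notebook's actual method: the Barclays Center coordinates are approximate, and the `haversine_km` and `within_radius` helpers and the sample venues are hypothetical names and data introduced only for illustration.

```python
import math

# Approximate coordinates of the Barclays Center (assumed for illustration).
BARCLAYS = (40.6826, -73.9754)
EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(venues, center=BARCLAYS, radius_km=1.6):
    """Keep venues (dicts with 'lat' and 'lng') within radius_km of center."""
    return [
        v for v in venues
        if haversine_km(center[0], center[1], v["lat"], v["lng"]) <= radius_km
    ]


# Hypothetical venues, not taken from the notebook's data.
sample = [
    {"name": "Near B&B", "lat": 40.6880, "lng": -73.9700},
    {"name": "Far Hostel", "lat": 40.7580, "lng": -73.9855},
]
nearby = within_radius(sample)
print([v["name"] for v in nearby])
```

A venue search API usually accepts a radius parameter already, so a check like this is mainly useful for validating or re-filtering results after they have been loaded into a dataframe.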
Sources and tools:
# Show only the first 10 names; remove .head(10) to see all 138 venues
dataframe_filtered_bnb_venues.name.head(10)
# Filter by category: the results include businesses and bus stops that are outside the scope of this analysis
categ_filter = ['Hostel', 'Bed & Breakfast']
dataframe_filtered_bnb = dataframe_filtered_bnb_venues[dataframe_filtered_bnb_venues.categories.isin(categ_filter)]
dataframe_filtered_bnb = dataframe_filtered_bnb.reset_index(drop=True)
dataframe_filtered_bnb
# Brooklyn neighbourhoods of interest with geo-coordinates
brook_neigh
brook_merged_short.loc[brook_merged_short['Cluster Labels'] == 0, brook_merged_short.columns[[0] + list(range(5, brook_merged_short.shape[1]))]]
brook_merged_short.loc[brook_merged_short['Cluster Labels'] == 1, brook_merged_short.columns[[0] + list(range(5, brook_merged_short.shape[1]))]]
brook_merged_short.loc[brook_merged_short['Cluster Labels'] == 2, brook_merged_short.columns[[0] + list(range(5, brook_merged_short.shape[1]))]]
brook_merged_short[['neighbourhood', 'postcode', 'Cluster Labels']]
dataframe_filtered_bnb['postalCode'].value_counts()
brook_11205_zillow.head()
brook_11217_zillow.head()
feat_data_11205 = brook_11205_zillow[['type_of_property','price','beds','bath','area_sqft']]
feat_data_11205.hist()
plt.show()
feat_data_11217 = brook_11217_zillow[['type_of_property','price','beds','bath','area_sqft']]
feat_data_11217.hist(color = "green")
plt.show()
#plotting results
plt.scatter(train.price,train.area_sqft, color='blue')
plt.plot(train_x, regr.coef_[0][0]*train_x + regr.intercept_[0], '-r')
plt.xlabel("price")
plt.ylabel('square_ft')
plt.show()
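# Correlate numeric columns only; 'type_of_property' is text and makes corr() fail in recent pandas versions
feat_data_11205.select_dtypes('number').corr()['price'].sort_values()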
#plotting results
plt.scatter(train.price,train.area_sqft, color='green')
plt.plot(train_x, regr.coef_[0][0]*train_x + regr.intercept_[0], '-r')
plt.xlabel("price")
plt.ylabel('square_ft')
plt.show()
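# Correlate numeric columns only; 'type_of_property' is text and makes corr() fail in recent pandas versions
feat_data_11217.select_dtypes('number').corr()['price'].sort_values()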
feat_data_11217.groupby(['type_of_property']).mean()
feat_data_11217.groupby(['type_of_property']).mean()['price'].plot.bar(figsize=(12,7), color ='green')
feat_data_11205.groupby(['type_of_property']).mean()
feat_data_11205.groupby(['type_of_property']).mean()['price'].plot.bar(figsize=(12,7))
brook_11217_zillow_filtered
Brooklyn, one of the five boroughs of New York City, has risen to prominence thanks to real estate investment and quick transport to Manhattan, and has become a trendy place for visitors from all over the world. We set out to find a suitable list of properties located within 1.6 km of the Barclays Center, with venues of interest nearby, in an area not crowded with B&Bs or hostels.
The initial analysis yielded a list of 11 B&Bs nearby, mostly concentrated in the Clinton Hill and Bed-Stuy neighborhoods. This left room to look at neighborhoods adjacent to the Barclays Center in the opposite direction, closer to Manhattan and to sites of interest such as museums and parks.
Once these neighborhoods were identified, K-Means clustering was used to find similarities and differences between them. This helped solidify the idea of looking for properties in Cluster 0, zip code 11217, whose neighborhoods offer similar venues such as bars, cafes and international restaurants that add to the visitor experience.
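As an illustration of the clustering step, here is a minimal sketch assuming each neighborhood is described by the relative frequency of its venue categories. The neighborhood names and frequency values are hypothetical, and the choice of three clusters and of scikit-learn's `KMeans` is an assumption based on the `Cluster Labels` values 0 to 2 seen in the notebook, not a confirmed reproduction of it.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical venue-category frequencies per neighborhood (each row sums to 1).
venue_freq = pd.DataFrame(
    {
        "Bar": [0.50, 0.45, 0.05, 0.10, 0.10, 0.05],
        "Cafe": [0.40, 0.45, 0.10, 0.05, 0.10, 0.15],
        "Park": [0.05, 0.05, 0.80, 0.75, 0.10, 0.05],
        "Deli": [0.05, 0.05, 0.05, 0.10, 0.70, 0.75],
    },
    index=pd.Index(["A", "B", "C", "D", "E", "F"], name="neighbourhood"),
)

# Fit K-Means on the frequency vectors; random_state makes the run repeatable.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(venue_freq)

# Attach the labels so neighborhoods can be inspected cluster by cluster.
clustered = venue_freq.copy()
clustered["Cluster Labels"] = kmeans.labels_
print(clustered["Cluster Labels"])
```

The numeric label assigned to each cluster is arbitrary, so a name like "Cluster 0" only has meaning within a single fitted model; what carries over is which neighborhoods end up grouped together.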
Web scraping on Zillow yielded a list of 100+ for-sale properties in each of the two postal codes under analysis, 11205 (Clinton Hill) and 11217 (Gowanus, Boerum Hill, Park Slope). The data needed some cleaning: outliers with a wrong square-foot area and records with missing information were removed. A linear regression on the remaining data supports its plausibility, since price increases with square-foot area once outliers and missing data are cleared out, which gives good confidence for business decision making.
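The cleaning and regression steps described above could look roughly like the sketch below. It is an assumption-laden illustration: the listings are made up, the IQR rule on price per square foot is one common way to drop implausible areas and not necessarily the rule used in the notebook, and `np.polyfit` stands in for whatever regression object the notebook fitted. Only the column names `price` and `area_sqft` come from the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical listings; column names follow the notebook's Zillow frames.
listings = pd.DataFrame(
    {
        "price": [900_000, 1_200_000, 1_500_000, 1_800_000,
                  2_100_000, 1_000_000, 1_300_000],
        "area_sqft": [1_000, 1_400, 1_800, 2_200, 2_600, 10, np.nan],
    }
)

# 1. Drop rows with a missing area.
clean = listings.dropna(subset=["area_sqft"]).copy()

# 2. Drop rows whose price per square foot is an IQR outlier
#    (this catches the listing with an implausible 10 sq ft area).
clean["price_per_sqft"] = clean["price"] / clean["area_sqft"]
q1, q3 = clean["price_per_sqft"].quantile([0.25, 0.75])
iqr = q3 - q1
clean = clean[clean["price_per_sqft"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]

# 3. Fit price as a linear function of area and check the correlation.
slope, intercept = np.polyfit(clean["area_sqft"], clean["price"], deg=1)
r = clean["area_sqft"].corr(clean["price"])
print(f"price = {slope:.0f} * area_sqft + {intercept:.0f}, r = {r:.2f}")
```

A positive slope and a correlation close to 1 after cleaning is the kind of sanity check the paragraph refers to: it does not prove the scraped data is correct, but a flat or negative relationship would be a strong hint that bad records remain.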
Ultra-expensive and very small properties were excluded from the recommendation because they are not suitable for the proposed business. The resulting properties share similar features such as area, beds and price.
With all of the above completed, the result is a recommendation of 9 properties that would be a good fit for a B&B business close to the Barclays Center. The pop-up on the map links to each property's real estate page.
Brooklyn is a vibrant place with many venues and a multicultural heritage, yet there seems to be room for Bed & Breakfast businesses to develop, providing an accessible hospitality option for younger, budget-minded travelers who spend most of the day visiting the city rather than enjoying the hotel or B&B premises.
Using different data analysis techniques, it was possible to create a reasonable business proposal and a recommendation for the problem at hand.
K-Means and regression are tools that help tell a story and validate the data, giving a more accurate picture of the environment under study.
Tools like these can help people make better decisions for their business.
The techniques, tools and methods learned in the Coursera-IBM specialization helped, in a plausible real-life scenario, to generate data and a story to tackle the proposed problem. This exercise could be extended and improved, which is part of the journey of becoming a Data Scientist. Very good course: I started from zero and I was able to understand many concepts that are useful in my day-to-day work.